Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimization for latency reduction in Product Quantization #397

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

AbhijitKulkarni1
Copy link

This PR optimizes the io.github.jbellis.jvector.quantization.KMeansPlusPlusClusterer#chooseInitialCentroids implementation using Java Vector API for the following cases.

  1. Calculate minimums in a vectorized way instead of scalar fashion one at a time in loop.
  2. Optimize scalar addition using vectorization.

Tested the optimized code with the setup as below and observed latency reduction of ~6% in Product Quantization for ada002-100k dataset execution.

Run Setup:
Jvector release used: 4.0.0-beta.1
JDK used:
openjdk version "23.0.1" 2024-10-15
OpenJDK Runtime Environment (build 23.0.1+11-39)
OpenJDK 64-Bit Server VM (build 23.0.1+11-39, mixed mode, sharing)

Socket 1 of m7i.metal-48xl machine used

To check the benefit of this optimization reasonably, following changes were done in Bench.java, PQParameters.java and Grid.java:

  1. In buildCompression and searchCompression parameters, enabled only PQ parameters.
  2. Disabled the caching for PQParameters.
  3. Disabled the testConfiguration calls, to measure latency only for quantization and Index building. Observed Index building time is almost same in both baseline and optimized runs:
    io/github/jbellis/jvector/example/Grid.java:141 --> /*indexes.forEach((features, index) -> { ...});*/

@marianotepper marianotepper self-requested a review February 28, 2025 18:24
@marianotepper
Copy link
Collaborator

marianotepper commented Feb 28, 2025

Thank you for your contribution! This PR looks really good.

I have a couple of minor comments and suggestions.

For helping with code readability, I have a few suggestions for placing the minInPlace in the following files:

  • VectorUtil.java -> line 160
  • VectorUtilSupport.java -> line 83
  • DefaultVectorUtilSupport.java -> line 289
  • NativeVectorUtilSupport.java -> line 118
  • PanamaVectorUtilSupport.java -> line 107
  • SimdOps.java -> line 614

No need to change the contents of the function or its signature, just its line placement. This way, it is closer to functions that are somewhat related.

In line 218 of KMeansPlusPlusClusterer.java, it seems like zeroing the vector makes no difference as we are replacing every value in the next loop. Is this correct?

@AbhijitKulkarni1
Copy link
Author

Thank you for your contribution! This PR looks really good.

I have a couple of minor comments and suggestions.

For helping with code readability, I have a few suggestions for placing the minInPlace in the following files:

  • VectorUtil.java -> line 160
  • VectorUtilSupport.java -> line 83
  • DefaultVectorUtilSupport.java -> line 289
  • NativeVectorUtilSupport.java -> line 118
  • PanamaVectorUtilSupport.java -> line 107
  • SimdOps.java -> line 614

No need to change the contents of the function or its signature, just its line placement. This way, it is closer to functions that are somewhat related.

In line 218 of KMeansPlusPlusClusterer.java, it seems like zeroing the vector makes no difference as we are replacing every value in the next loop. Is this correct?

Thanks for your suggestions, I will update accordingly. Yes, you are correct about line 218 of KMeansPlusPlusClusterer.java, for cleaner implementation had set to zero. Please let me know if you are of the opinion to remove it?

@sam-herman
Copy link
Contributor

sam-herman commented Mar 1, 2025

@marianotepper @AbhijitKulkarni1 if this helps, I added an index construction and PQ benchmarks you can use to test the results.
#398

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants